The fixation index (FST) is a measure of population differentiation (a form of genetic distance) based on genetic polymorphism data, such as single-nucleotide polymorphisms (SNPs) or microsatellites. It is a special case of F-statistics, a concept developed in the 1920s by Sewall Wright.
The fixation index, FST, is simply a measure of the diversity of randomly chosen alleles within the same sub-population relative to that found in the entire population. It is often expressed as the proportion of genetic diversity due to allele frequency differences among populations.[1]
This comparison of genetic variability within and between populations is frequently used in the field of population genetics. The values range from 0 to 1. A value of zero implies complete panmixis; that is, the two populations interbreed freely. A value of one implies that the two populations are completely separate.
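The 0-to-1 range can be illustrated with a minimal sketch. It assumes the textbook heterozygosity formulation FST = (H_T − H_S) / H_T for a single biallelic locus with equally weighted sub-populations; that formulation and the function names below are illustrative assumptions, not taken from the text above.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_from_frequencies(freqs):
    """F_ST = (H_T - H_S) / H_T, assuming equally weighted sub-populations.

    freqs: allele frequency of the same allele in each sub-population.
    """
    if not freqs:
        raise ValueError("need at least one sub-population")
    # H_S: mean expected heterozygosity within sub-populations
    h_s = sum(heterozygosity(p) for p in freqs) / len(freqs)
    # H_T: expected heterozygosity of the pooled (total) population
    p_bar = sum(freqs) / len(freqs)
    h_t = heterozygosity(p_bar)
    if h_t == 0.0:
        raise ValueError("locus is monomorphic in the total population; F_ST is undefined")
    return (h_t - h_s) / h_t


print(fst_from_frequencies([0.5, 0.5]))  # identical frequencies: no differentiation
print(fst_from_frequencies([1.0, 0.0]))  # fixed for different alleles: complete differentiation
print(fst_from_frequencies([0.7, 0.3]))  # intermediate case
```

Identical allele frequencies give 0, populations fixed for different alleles give 1, and the intermediate case gives roughly 0.16.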
Several definitions of FST have been used, all measuring different but related quantities. A common definition is:[2]

$$F_{ST} = \frac{\pi_{\text{Between}} - \pi_{\text{Within}}}{\pi_{\text{Between}}}$$

where $\pi_{\text{Between}}$ and $\pi_{\text{Within}}$ represent the average number of pairwise differences between two individuals sampled from different ($\pi_{\text{Between}}$) or the same ($\pi_{\text{Within}}$) population. The average pairwise difference within a population can be calculated as the sum of the pairwise differences divided by the number of pairs. Note that when using this definition of FST, $\pi_{\text{Within}}$ should be computed for each population and then averaged. Otherwise, random sampling of pairs within populations puts all the weight on the population with the largest sample size.
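As a rough sketch of this calculation, the snippet below assumes the pairwise-difference definition takes the form (pi_between − pi_within) / pi_between, treats each individual as an aligned sequence string, and counts differences with a simple Hamming distance. The function names and the toy sequences are hypothetical. The within-population term is computed per population and then averaged, so that the larger sample does not dominate.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of positions at which two aligned sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_within(pop):
    """Sum of pairwise differences within one population divided by the number of pairs."""
    pairs = list(combinations(pop, 2))
    if not pairs:
        raise ValueError("need at least two sequences per population")
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def mean_between(pop_a, pop_b):
    """Average pairwise difference between sequences drawn from different populations."""
    pairs = list(product(pop_a, pop_b))
    return sum(hamming(a, b) for a, b in pairs) / len(pairs)


def fst_pairwise(pop_a, pop_b):
    # Average the per-population values rather than pooling all within-population pairs.
    pi_within = (mean_within(pop_a) + mean_within(pop_b)) / 2.0
    pi_between = mean_between(pop_a, pop_b)
    if pi_between == 0:
        raise ValueError("no differences between populations; F_ST is undefined")
    return (pi_between - pi_within) / pi_between


# Toy data with unequal sample sizes (hypothetical sequences).
pop_a = ["AAAA", "AAAT", "AAAA"]
pop_b = ["TTTT", "TTTA"]
print(fst_pairwise(pop_a, pop_b))  # 16/21, about 0.762
```

For the toy data, the within-population averages are 2/3 and 1, the between-population average is 3.5, and the resulting value is 16/21.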
FST has been heavily criticized as a measure of differentiation, and it has been suggested that D is a better measure on both statistical and theoretical grounds.[3][4] However, its continued use has been supported under certain circumstances.[5]
Differentiation can be assessed using online tools, some of which are listed below.
The International HapMap Project estimated FST for three human populations using SNP data. A more complex formula for FST was used in order to account for differences in sample size:
In the above equation, $x_{ij}$ is the estimated frequency (proportion) of the minor allele at SNP $i$ in population $j$, $n_{ij}$ is the number of genotyped chromosomes at that position, and $n_j$ is the number of chromosomes analysed in that population. The lack of the $j$ subscript in the denominator indicates that the statistics $n_i$ and $x_i$ are calculated across the combined data sets.
Across the autosomes, FST was estimated to be 0.12. The significance of this FST value in humans is contentious. Because an FST of zero indicates no divergence between populations, whereas an FST of one indicates complete isolation of populations, anthropologists often cite Lewontin's 1972 work, which arrived at a similar value and interpreted it as meaning that there was little biological difference between human races.[6] On the other hand, while an FST value of 0.12 might be lower than that found between populations of many other species, Henry Harpending pointed out that this value implies that, on a world scale, a "kinship between two individuals of the same human population is equivalent to kinship between grandparent and grandchild or between half siblings".[7]
Population | Europe (CEU) | Sub-Saharan Africa (Yoruba) | East-Asia (Japanese) |
---|---|---|---|
Sub-Saharan Africa (Yoruba) | 0.153 | | |
East-Asia (Japanese) | 0.111 | 0.190 | |
East-Asia (Chinese) | 0.110 | 0.192 | 0.007 |

Population | Palestinians | Greeks | Italian | Spanish | Basque | Irish | German | Russian |
---|---|---|---|---|---|---|---|---|
Greeks | 0.0057 | | | | | | | |
Italian | 0.0064 | 0.0001 | | | | | | |
Spanish | 0.0101 | 0.0035 | 0.0010 | | | | | |
Basque | 0.0199 | 0.0098 | 0.0084 | 0.0060 | | | | |
Irish | 0.0170 | 0.0067 | 0.0048 | 0.0037 | 0.0086 | | | |
German | 0.0136 | 0.0039 | 0.0029 | 0.0015 | 0.0079 | 0.0010 | | |
Russian | 0.0202 | 0.0108 | 0.0088 | 0.0079 | 0.0126 | 0.0038 | 0.0037 | |
Swedish | 0.0191 | 0.0084 | 0.0064 | 0.0055 | 0.0100 | 0.0020 | 0.0007 | 0.0036 |